Abstract

The Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging virus that poses a 
major challenge to clinical management. 

The 3C-like protease (3CLpro) is essential for viral replication and thus represents a potential target for antiviral drug development. Presently, very few data are available on MERS-CoV 3CLpro inhibition by small molecules. We conducted an extensive exploration of the pharmacophoric space of a recently identified set of peptidomimetic inhibitors of the bat HKU4-CoV 3CLpro. HKU4-CoV 3CLpro shares high sequence identity (81%) with the MERS-CoV enzyme and thus represents a potential surrogate model for anti-MERS drug discovery. We used 2 well-established methods: quantitative structure-activity relationship (QSAR)-guided modeling and docking-based comparative intermolecular contacts analysis. The established pharmacophore models highlighted structural features needed for ligand recognition and revealed important binding-pocket regions involved in 3CLpro-ligand interactions. The best models were used as 3D queries to screen the National Cancer Institute database for novel nonpeptidomimetic 3CLpro inhibitors. The identified hits were tested for HKU4-CoV and MERS-CoV 3CLpro inhibition. Two hits, which share the phenylsulfonamide fragment, showed moderate inhibitory activity against the MERS-CoV 3CLpro and represent a potential starting point for the development of novel anti-MERS agents. To the best of our knowledge, this is the first pharmacophore modeling study supported by in vitro validation on the MERS-CoV 3CLpro.


Highlights:
• MERS-CoV is an emerging virus that is closely related to the bat HKU4-CoV.
• 3CLpro is a potential drug target for coronavirus infection.
• HKU4-CoV 3CLpro is a useful surrogate model for the identification of MERS-CoV 3CLpro enzyme inhibitors.
• dbCICA is a very robust modeling method for hit identification.
• The phenylsulfonamide scaffold represents a potential starting point for the development of MERS coronavirus 3CLpro inhibitors.
1 | INTRODUCTION


Middle East respiratory syndrome coronavirus (MERS-CoV; HCoV- 
EMC/2012) is an emerging virus that causes severe pneumonia illness 
and exhibits a high mortality rate.* The first known human MERS-CoV 
cases occurred in Jordan in 2012, before the causative virus was 
detected and identified later during the same year in Saudi Arabia.?* 
Since then, over 1900 laboratory-confirmed cases have been reported 
to the WHO in 27 countries across the world.* 

MERS-CoV is an enveloped virus carrying a genome of positive 
sense RNA.° The virus, which is considered primarily as a zoonotic 
virus, belongs to the lineage C of Betacoronavirus, thus is closely 
related to the bat coronaviruses HKU4 and HKU5.°® Several studies 
have shown that bats and camels are the most likely animal reservoir 
of MERS-CoV.? +? Accumulating evidence points to virus transmission 
from dromedary camels to humans.?2:79 

As is the case with many viral diseases, effective therapy against MERS is lacking and supportive care is the only available treatment option. Attempts to develop an effective vaccine against MERS-CoV infection have led to promising results but are still in early stages.1*7° The high morbidity and mortality rates of MERS-CoV as well as its potential to cause epidemics highlight the need for novel drug discovery to develop effective and safe anti-MERS-CoV therapeutics.

Several efforts have been undertaken to identify selective potent 
small molecules with anti-MERS-CoV activity.1721 Promising 
compounds were identified via screening of FDA-approved drugs and 
drug-like small molecules using cell-based systems and in vitro 
screening.*”-4 

Targets homologous to those identified in the severe acute respiratory syndrome coronavirus (SARS-CoV) were investigated in MERS-CoV (reviewed in Hilgenfeld and Peiris?°).2°?? Among these, the MERS-CoV main proteinase, also known as 3-chymotrypsin-like protease (3CLpro), is considered an important potential target due to its essential role in the viral life cycle.2°?? The coronavirus genome encodes an 800-kDa replicase polyprotein, which is processed by the 3CLpro to yield intermediate and mature nonstructural proteins responsible for many aspects of virus replication.°°°%+ The enzyme has started to attract interest as a target for anti-MERS-CoV drug development. However, data on the enzyme inhibition are scarce. The SARS-CoV 3CLpro has been comprehensively explored as a drug target, and many potent enzyme inhibitors have been identified.:2°.3233 Elaborate structure- and ligand-based in silico models obtained using the SARS-CoV 3CLpro inhibitors proved fruitless for the identification of MERS-CoV 3CLpro inhibitors (modeling studies conducted by our group, data not published). Interestingly, the 3CLpro enzymes from different CoV strains are known to share significant sequence and 3D structure homology, providing a strong structural basis for designing wide-spectrum anti-CoV inhibitors.2*%° Sequence alignment studies showed that the active site residues of the HKU4-CoV 3CLpro that participate in inhibitor binding are conserved in the MERS-CoV 3CLpro, which has 81.0% sequence identity** to HKU4-CoV 3CLpro (Figure 1). Therefore, the bat HKU4-CoV 3CLpro has been investigated as a surrogate model for anti-MERS development.?° Novel peptidomimetic inhibitors of MERS-CoV 3CLpro have been identified by using the enzyme from HKU4-CoV as a model.°¢

In this study, we used the set of peptidomimetic HKU4-CoV 3CLpro inhibitors reported in St. John et al°° to conduct extensive computational modeling studies. These modeling efforts aim at establishing pharmacophore models to be used as 3D search queries for virtual screening of potential MERS-CoV 3CLpro inhibitors. The methods used here were developed previously by our group: the QSAR-guided pharmacophore modeling?”** and the docking-based comparative intermolecular contacts analysis (dbCICA) pharmacophore modeling.°?° Both modeling approaches have been used successfully to identify potent inhibitors against several drug targets.°”“1 The identified hits were tested in vitro for their inhibitory activity against the 3CLpro enzymes from HKU4-CoV and MERS-CoV.


2 | MATERIAL AND METHODS

All chemicals and reagents were purchased from Sigma-Aldrich (United States), unless otherwise stated.


2.1 | QSAR-guided pharmacophore modeling

2.1.1 | Data preparation and pharmacophore exploration

The structures and biological data of 221 previously identified HKU4-CoV 3CLpro inhibitors reported in St. John et al®° (1-221, Table S1) were used in modeling.

The bioactivities of these inhibitors were expressed as the concentration of the test compound that inhibited the activity of HKU4-CoV 3CLpro by 50% (IC50, µM). In cases of unavailable IC50 values (ie, 20-25 and 48-221, Table S1), the corresponding IC50 estimates were extrapolated based on reported inhibitory percentages at 100 µM assuming linear dose-response relationships. The logarithms of measured IC50 (µM) values were used in QSAR-guided pharmacophore modeling to correlate bioactivity data linearly to free energy change. Chiral centers with unknown configuration were marked as “unknown” so that the inversion of these chiral centers is sampled during conformation generation.
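
The single-point extrapolation described above can be illustrated with a short sketch. It assumes the simplest reading of a "linear dose-response relationship", namely that percent inhibition is proportional to concentration and passes through the origin; the function names and the 100 µM default are illustrative and are not taken from the original workflow.

```python
import math


def extrapolate_ic50(percent_inhibition, test_conc_um=100.0):
    """Estimate IC50 (uM) from one percent-inhibition reading.

    Assumption: inhibition = k * concentration (linear, through the
    origin), so the concentration giving 50% inhibition is
    50 * test_conc / percent_inhibition.
    """
    if not 0.0 < percent_inhibition <= 100.0:
        raise ValueError("percent inhibition must be in (0, 100]")
    return 50.0 * test_conc_um / percent_inhibition


def log_ic50(ic50_um):
    """Base-10 logarithm of an IC50 value expressed in uM."""
    return math.log10(ic50_um)


if __name__ == "__main__":
    # Hypothetical compound showing 25% inhibition at 100 uM.
    ic50 = extrapolate_ic50(25.0)
    print(ic50, log_ic50(ic50))
```

Under this assumption, weakly active compounds receive IC50 estimates above the tested concentration, which is why such values are best treated as rough estimates.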

These compounds were used to explore the pharmacophoric space of HKU4-CoV 3CLpro through a series of established modeling steps as has been described previously.°°*7"* The modeling workflow is detailed in Sections S1 to S5.


2.1.2 | QSAR modeling 


QSAR-guided selection of optimal pharmacophores was conducted to find an optimal combination of pharmacophore models capable of explaining bioactivity variation across the whole set of collected training compounds (1-221, Table S1).°° QSAR modeling was done using the genetic function algorithm (GFA) to generate combinations of descriptors (physicochemical and pharmacophores) (Sections S6 and S7). Subsequently, multiple linear regression (MLR) analyses were used to assess the qualities of selected descriptor combinations, ie, their ability to explain bioactivity variations within collected inhibitors. This QSAR modeling was performed using a training set of 177 compounds of the total set of HKU4-CoV 3CLpro inhibitors and
FIGURE 1 Comparison of the binding site of 3CLpro from HKU4-CoV and MERS-CoV. (A) A ribbon presentation of the superimposition of the HKU4-CoV 3CLpro complex with a potent inhibitor (blue ribbons and green carbon atoms, 1.8 Å, PDB code 4YOI) and the MERS-CoV enzyme (red ribbons and gray carbon atoms, 2.1 Å, PDB code 4YLU), showing the high similarity in protein folding and a close-up view of the main residues interacting with inhibitors in HKU4-CoV and MERS-CoV 3CLpro binding pockets. The figure was prepared using the DS visualizer. (B) Amino acid sequence alignment of the 3CLpro from HKU4-CoV and MERS-CoV enzyme. The sequence alignment was generated by using Clustal Omega. Residues strictly conserved have a red background; similar residues are indicated by black bold letters with a yellow background according to a Risler matrix implemented in ESPript. The symbols above the sequence correspond to the secondary structure of MERS-CoV 3CLpro (PDB code 4YLU; Tomar et al°°). The blue stars indicate residues in the binding pocket of the enzymes. MERS-CoV, Middle East respiratory syndrome coronavirus; PDB, Protein Data Bank


validated using leave-one-out r² (r²LOO) and predictive r² (r²PRESS) against a randomly selected testing set of 44 inhibitors as described in Sections S6, S7, and S8. The test set was selected by ranking the total 221 inhibitors according to their IC50 values, and then every fifth compound was selected for the testing set starting from the high-potency end.
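
The ranked every-fifth split can be sketched as follows. This is a minimal illustration, assuming that the first test compound is the fifth most potent one (the text does not state the starting offset); with 221 compounds this choice reproduces the reported 177/44 split. The function name and the input dictionary are hypothetical.

```python
def split_every_fifth(ic50_by_compound):
    """Split compounds into training and test sets.

    Compounds are ranked from most potent (lowest IC50) to least
    potent, and every fifth ranked compound goes to the test set.
    Assumption: the first pick is the 5th-ranked compound.
    """
    ranked = sorted(ic50_by_compound, key=ic50_by_compound.get)
    test = ranked[4::5]
    test_ids = set(test)
    train = [cid for cid in ranked if cid not in test_ids]
    return train, test


if __name__ == "__main__":
    # Placeholder IC50 values for 221 compounds (not real data).
    fake_ic50 = {cid: float(cid) for cid in range(1, 222)}
    train, test = split_every_fifth(fake_ic50)
    print(len(train), len(test))
```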


2.2 | Docking-based comparative intermolecular 
contacts analysis 


Docking studies were performed using a subset of 27 compounds of the peptidomimetic HKU4-CoV 3CLpro inhibitors with known (absolute) stereochemistries (1-27, Table S1). The 3D coordinates of HKU4-CoV 3CLpro were retrieved from the Protein Data Bank (PDB code: 4YOI, 1.8 Å).°° The protein structure was modified by adding hydrogen atoms and Gasteiger-Marsili charges to the protein atoms using Discovery Studio (version 2.5.5; Accelrys Inc, San Diego). It was then used in subsequent docking experiments without energy minimization.

Docking was conducted using both LibDock*” and CDOCKER.*® LibDock is a site-feature docking algorithm that docks ligands (after removing hydrogen atoms) into an active site guided by binding hotspots,?” whereas CDOCKER is a CHARMm-based molecular dynamics method that implements simulated annealing to search for the most stable docked ligand poses.*® These docking engines consider the flexibility of the ligand while treating the receptor as rigid. Details of each docking engine and the corresponding docking settings are described in Sections S9 to S10. The highest-ranking docked conformers/poses were scored using 7 scoring functions: Jain, LigScore1, LigScore2, PLP1, PLP2, PMF, and PMF04 (Section S11).47°°? The docking-scoring cycles using both engines were repeated to cover all possible docking combinations resulting from the presence (or absence) of crystallographically explicit water molecules within the binding site.

Taking into account each scoring function in turn, the highest scoring docked conformer/pose of each inhibitor was chosen to be used in subsequent comparative intermolecular contacts analysis (dbCICA) modeling.2?*° This step resulted in 7 docking/scoring combinations of the 27 compounds, each scored with the corresponding scoring function. The docking and scoring cycle was repeated twice to cover both docking conditions, ie, the presence or absence of explicit water molecules. The resulting 14 docking/scoring sets were used in dbCICA modeling as described previously.2”*° Sections S12 to S13 describe details of dbCICA modeling. Successful dbCICA models were used to guide the manual building of pharmacophores (Section S14).


ABUHAMMAD ET AL.

Molecular Recognition, WILEY
2.3 | Validation and steric refinement of 
pharmacophore models 


Optimal pharmacophores (both structure and ligand based) were validated using receiver operating characteristic (ROC) curve analysis to assess the ability of each model to correctly classify a group of compounds into actives and inactives (Section S15).3740.54 The Matthews correlation coefficient (MCC) was also calculated as an additional validation.°° Additionally, exclusion spheres were added using the HIPHOP-REFINE module of Discovery Studio to improve the ROC properties of the QSAR-guided pharmacophore (Section S8).
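
For readers unfamiliar with the classification statistics reported alongside the ROC analysis, the sketch below computes accuracy, specificity, true positive rate, and MCC from a confusion matrix using their standard textbook definitions. The counts in the example are hypothetical and do not come from the study; the helper name is illustrative.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return ACC, SPC, TPR, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    spc = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SPC": spc, "TPR": tpr, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical screen: 20 actives hidden among 980 decoys.
    print(classification_metrics(tp=10, fp=90, tn=890, fn=10))
```

Because MCC accounts for all four cells of the confusion matrix, it stays low when a model flags nearly everything as active, even if the true positive rate is 1.00.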


2.4 | Virtual screening for new HKU4-CoV 3CLpro inhibitors

The selected pharmacophores were used as 3D search queries to screen the National Cancer Institute (NCI) database>° for new 3CLpro inhibitors.

Hits captured by the QSAR-guided pharmacophore were filtered by the Lipinski criteria to ensure good pharmacokinetic properties?” and the SMILES arbitrary target specification (SMARTS) filter (Section S16) to remove reactive ligands (eg, alkyl halides or Michael acceptors).°? Remaining hits were fitted against the corresponding individual pharmacophores. The fit values were then substituted in the MLR-based QSAR models to predict the hits' bioactivities (-log(IC50)). The highest-ranking hits were selected for in vitro testing using a voting system to minimize the influence of QSAR-based predictions on hit prioritization. In this system, each hit's fit value and the hit's overall QSAR predictions cast a vote of “one” if the value is within the top 20% of all captured hits; otherwise the vote is “zero.”
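
The voting scheme can be sketched as below. This is only an illustration under stated assumptions: every criterion is scored so that higher is better, "top 20%" is taken as the best ceil(0.2 × N) hits per criterion, ties at the cutoff all receive a vote, and the hit names, criterion names, and values are hypothetical.

```python
import math


def vote_hits(scores, top_fraction=0.20):
    """Count votes per hit.

    scores maps hit id -> {criterion name: value}, higher = better.
    A hit earns one vote per criterion in which its value lies within
    the top `top_fraction` of all hits.
    """
    hits = list(scores)
    criteria = sorted({name for s in scores.values() for name in s})
    n_top = max(1, math.ceil(top_fraction * len(hits)))
    votes = {hit: 0 for hit in hits}
    for name in criteria:
        ranked = sorted((scores[h][name] for h in hits), reverse=True)
        cutoff = ranked[n_top - 1]
        for hit in hits:
            if scores[hit][name] >= cutoff:
                votes[hit] += 1
    return votes


if __name__ == "__main__":
    demo = {
        "hit1": {"fit": 5.0, "qsar": 1.0},
        "hit2": {"fit": 4.0, "qsar": 9.0},
        "hit3": {"fit": 3.0, "qsar": 3.0},
        "hit4": {"fit": 2.0, "qsar": 2.0},
        "hit5": {"fit": 1.0, "qsar": 0.5},
    }
    print(vote_hits(demo))
```

Counting votes across criteria, instead of ranking on the QSAR prediction alone, keeps a single optimistic prediction from dominating the selection.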

Similarly, hits captured from all successful dbCICA-derived pharmacophores were pooled together and filtered according to the Lipinski criteria?” and SMARTS filter.°° The hits were then docked into the HKU4-CoV 3CLpro binding pocket (4YOI) using the same docking/scoring conditions of each successful dbCICA model. The resulting docked poses were then analyzed for critical contacts (according to successful dbCICA models), and the sums of critical contacts for each hit compound were used for the prediction of their corresponding IC50 values. The highest-ranking hits were selected for in vitro testing using a similar voting system to that described above: Each docking solution casts a vote of “one” if the predicted value is within the top 10% of all captured hits; otherwise it casts a vote of “zero.”


2.5 | Protein expression and purification 


MERS-CoV 3CLpro was expressed through auto-induction in Escherichia coli BL21-DE3 cells in the presence of 100 µg/mL of carbenicillin as described previously.°°°? Cells were harvested by centrifugation at 5000g for 20 minutes at 4°C, and the pellets were stored at -80°C until further use. MERS-CoV 3CLpro purification was performed using consecutive steps of hydrophobic-interaction chromatography, DEAE anion-exchange chromatography, Mono S cation-exchange chromatography, and size-exclusion chromatography as described previously.2° HKU4-CoV 3CLpro was produced and purified using a modified protocol from Agnihothram et al.°° Final protein yield was calculated based on the measurement of total activity units (µM product/min), specific activity (units/mg), and milligrams of protein obtained (BioRad protein assay) after each chromatographic step.


2.6 | Inhibition assays 


Inhibition assays were conducted as described previously.2° Each of the acquired hits was screened for inhibition of HKU4 3CLpro and MERS 3CLpro at a concentration of 40 µM in duplicate assays containing the following assay buffer (50 mM HEPES, 0.1 mg/mL BSA, 0.01% Triton X-100, 2 mM DTT). Compound 1 (the most potent compound in the training set; Table S1; St. John et al°*#'¢ 14) was used as a positive control. The assays were conducted in Costar 3694 EIA/RIA 96-Well Half Area, Flat Bottom, Black Polystyrene plates (Corning, New York). A total of 1 µL of 100X inhibitor stock in dimethyl sulfoxide (DMSO) was added to 79 µL of enzyme in assay buffer, and the enzyme-inhibitor mixture was incubated for 5 minutes. The reaction was initiated by the addition of 20 µL of 10 µM UIVT3 substrate, a custom synthesized Förster resonance energy transfer substrate peptide with the following sequence: HilyteFluor 488-ESATLQSGLRKAK-QXL520-NH2, producing final concentrations of 250 nM HKU4-CoV 3CLpro, 500 nM MERS-CoV 3CLpro, and 100uUM UIVT3 substrate. The fluorescence intensity of the reaction was then measured over time as relative fluorescence units (RFUs) for a period of 10 minutes, using an excitation wavelength of 485 nm and bandwidth of 20 nm and monitoring emission at 528 nm and bandwidth of 20 nm using a BioTek Synergy H1 multimode microplate reader. The inhibition of the HKU4-CoV 3CLpro and MERS-CoV 3CLpro by hit compounds was monitored by following the change in RFUs over time, using the initial slope of the progress curve to determine the initial rate (Vi). The percent inhibition of each 3CLpro enzyme was determined using the following equation:


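$$
\%\,\text{Inhibition} = \left[1 - \frac{\text{Inhibited 3CL}^{\text{pro}}\ \text{RFU/s} - \text{Background RFU/s}}{\text{Uninhibited 3CL}^{\text{pro}}\ \text{RFU/s} - \text{Background RFU/s}}\right] \times 100 \tag{1}
$$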
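
Equation 1 translates directly into code. The sketch below is illustrative only: the function name is hypothetical, and the rates in the example are made-up initial slopes in RFU/s, not measured values.

```python
def percent_inhibition(inhibited_rate, uninhibited_rate, background_rate=0.0):
    """Percent inhibition from initial rates (RFU/s), as in Equation 1."""
    denominator = uninhibited_rate - background_rate
    if denominator == 0:
        raise ValueError("uninhibited rate equals background rate")
    ratio = (inhibited_rate - background_rate) / denominator
    return (1.0 - ratio) * 100.0


if __name__ == "__main__":
    # Hypothetical slopes: inhibited 55, uninhibited 105, background 5 RFU/s.
    print(percent_inhibition(55.0, 105.0, 5.0))
```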


The IC50 values were determined at ambient temperature from 100-µL assays performed in triplicate in the following buffer: 50 mM HEPES, 0.1 mg/mL BSA, 0.01% Triton X-100, 2 mM DTT. Kinetic assays were conducted in Costar 3694 EIA/RIA 96-Well Half Area, Flat Bottom, Black Polystyrene plates (Corning, NY). Each inhibitor was tested at concentrations ranging from 2.54 µM to 400 µM. A total of 1 µL of 100X inhibitor stock in DMSO was added to 79 µL of enzyme in assay buffer, and the enzyme-inhibitor mixture was incubated for 5 minutes. The reaction was initiated by the addition of 20 µL of 10 µM UIVT3 substrate, producing final concentrations of 250 nM HKU4-CoV 3CLpro, 500 nM MERS-CoV 3CLpro, and 2 µM UIVT3 substrate. The fluorescence intensity of the reaction was then measured over time as RFUs for a period of 10 minutes, using an excitation wavelength of 485 nm and bandwidth of 20 nm and monitoring emission at 528 nm and bandwidth of 20 nm using a BioTek Synergy H1 multimode microplate reader. The percent inhibition of the 3CLpro enzymes was then plotted as a function of inhibitor concentration. The SigmaPlot Enzyme Kinetics Wizard was used to fit the triplicate percent inhibition data and associated standard error to a nonlinear Michaelis-Menten type regression model and determine the IC50 for each enzyme using the following equation:


$$
\%\,\text{Inhibition} = \frac{\%I_{\max} \times [\text{Inhibitor}]}{IC_{50} + [\text{Inhibitor}]} \tag{2}
$$


where %Imax is the percent maximum inhibition of 3CLpro. The error in IC50 values was determined as the error in the fitted parameter. Controls were performed in which the enzyme, the substrate, or both were omitted. Fluorescence attenuation controls were carried out by adding the inhibitors to the cleaved substrate in a reaction mixture identical to that used in the inhibition assays.
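
To make the fitting step concrete, here is a minimal, dependency-free sketch of fitting the Michaelis-Menten type model, %Inhibition = %Imax × [I] / (IC50 + [I]), by least squares. It is not the SigmaPlot algorithm: it scans IC50 over a logarithmic grid and, for each candidate, solves for %Imax in closed form. The function name, grid bounds, and the synthetic data are assumptions made for illustration.

```python
def fit_ic50(concs, inhibitions, lo=1e-3, hi=1e4, steps=4000):
    """Least-squares fit of inhibition = imax * c / (ic50 + c).

    For a fixed ic50 the model is linear in imax, so imax has a
    closed-form solution; ic50 is scanned on a log-spaced grid.
    Returns (ic50, imax) in the units of `concs` and percent.
    """
    best = None
    for k in range(steps + 1):
        ic50 = lo * (hi / lo) ** (k / steps)
        x = [c / (ic50 + c) for c in concs]
        sxx = sum(v * v for v in x)
        imax = sum(v * y for v, y in zip(x, inhibitions)) / sxx
        sse = sum((y - imax * v) ** 2 for v, y in zip(x, inhibitions))
        if best is None or sse < best[0]:
            best = (sse, ic50, imax)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free data generated with IC50 = 50 uM, Imax = 100%.
    concs = [2.5, 5.0, 10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
    data = [100.0 * c / (50.0 + c) for c in concs]
    print(fit_ic50(concs, data))
```

A grid scan is slower than a gradient-based optimizer but needs no starting guess, which keeps the sketch short; parameter standard errors, which the original analysis reports, are not computed here.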


3 | RESULTS AND DISCUSSION 


3.1 | Ligand-based approach: QSAR-guided 
pharmacophore modeling 


The pharmacophoric space of 221 HKU4-CoV 3CLpro inhibitors was extensively explored through 112 HYPOGEN automatic runs performed on 14 carefully selected training subsets comprising 14 to 22 compounds (Section 2.1 and Tables S1 and S2). The training compounds in each subset were selected to ensure that each set represents a common binding mode and that bioactivity differences among its members are attributable to the presence or absence of pharmacophoric features. Applying this strategy allows an effective exploration of the pharmacophoric space of HKU4-CoV 3CLpro inhibitors and helps to identify pharmacophoric hypotheses representing all possible binding modes assumed by 3CLpro.984246 These runs resulted in 677 successful pharmacophore models, which were then clustered using the hierarchical average linkage method available in CATALYST. The best 68 representative models were used in subsequent QSAR modeling (Section 2.1).


[Figure 2A: scatter plot of predicted LE versus experimental LE for training and testing compounds]


The fit values obtained by mapping the 68 representative pharmacophores against the HKU4-CoV 3CLpro inhibitors were enrolled together with a selection of 2D descriptors as independent variables in QSAR analysis.

Genetic function algorithm combined with MLR analyses was used to select different combinations of pharmacophores and 2D molecular descriptors that are capable of explaining bioactivity variation among collected inhibitors.

However, all attempts to achieve statistically successful QSAR models failed, prompting the use of ligand efficiency [LE = -log(IC50)/heavy atom count] as an alternative response variable instead of -log(IC50).6* 4 The best QSAR models are summarized in Equations 3 and 4. Figure 2A,B shows the corresponding scatter plots of experimental versus estimated bioactivities for training and testing inhibitors.
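
The ligand efficiency response variable is straightforward to compute. The sketch below assumes, in line with the data preparation described in Section 2.1.1, that IC50 is expressed in µM, which is why weakly active compounds receive negative LE values; the function name and the example compound are hypothetical.

```python
import math


def ligand_efficiency(ic50_um, heavy_atom_count):
    """LE = -log10(IC50) / heavy atom count, with IC50 in uM."""
    if ic50_um <= 0 or heavy_atom_count <= 0:
        raise ValueError("IC50 and heavy atom count must be positive")
    return -math.log10(ic50_um) / heavy_atom_count


if __name__ == "__main__":
    # Hypothetical inhibitor: IC50 = 10 uM, 20 heavy (non-hydrogen) atoms.
    print(ligand_efficiency(10.0, 20))
```

Normalizing by heavy atom count rewards compounds that achieve their potency with fewer atoms, which can reduce the size bias that often complicates QSAR fits on raw -log(IC50).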


$$
\begin{aligned}
LE ={} & -0.12 + 1.98\times10^{-3}(\text{AromaticBonds}) + 5.95\times10^{-4}(\text{Dipole}) \\
& - 1.22\times10^{-3}(\text{DipoleX}) - 6.64\times10^{-?}(\text{DipoleY}) \\
& - 9.7\times10^{-4}(\text{LUMO}) + 2.22\times10^{-3}[\text{Hypo(K-T5-3)}] \\
& + 4.73\times10^{-?}[\text{Hypo(L-T5-2)}]
\end{aligned}
\tag{3}
$$

n = 177, r² = 0.637, F-statistic = 42.408, r²LOO = 0.572, r²PRESS = 0.675.

$$
\begin{aligned}
LE ={} & -0.11 + 1.99\times10^{-3}(\text{AromaticBonds}) - 9.53\times10^{-4}(\text{DipoleX}) \\
& - 6.58\times10^{-?}(\text{DipoleY}) - 9.30\times10^{-4}(\text{LUMO}) \\
& + 4.89\times10^{-?}[\text{Hypo(L-T5-2)}] + 2.39\times10^{-?}[\text{Hypo(N-T1-1)}]
\end{aligned}
\tag{4}
$$

n = 177, r² = 0.625, F-statistic = 47.298, r²LOO = 0.584, r²PRESS = 0.647.

where n is the number of training compounds used to generate this equation, F is the Fisher statistic, r²LOO is the leave-one-out cross-validation correlation coefficient, and r²PRESS is the predictive r² determined for 44 randomly selected test compounds. AromaticBonds is the number of aromatic bonds in the molecule; Dipole, DipoleX, and DipoleY are dipole moment descriptors that indicate the strength and orientation behavior of a molecule in an electrostatic field; LUMO is the energy of the lowest unoccupied molecular orbital;°? Hypo(L-T5-2), Hypo(K-T5-3), and Hypo(N-T1-1) represent the fit values of the training compounds against corresponding


[Figure 2B: scatter plot of predicted LE versus experimental LE for training and testing compounds]


FIGURE 2. Experimental versus predicted bioactivities for the training and testing compounds. Predicted bioactivities calculated using the best 
QSAR models: (A) Equation 3 and (B) Equation 4. The solid line is the regression line for the fitted and predicted bioactivities of training and 
test compounds, respectively, whereas the dotted lines indicate arbitrary error margins. 
pharmacophores (see Table S3). Figure 3 shows the 3 pharmacophores and how they fit the most potent training compound (1, IC50 = 0.33 µM**).

The appearance of the AromaticBonds descriptor combined with positive slopes in both QSAR equations indicates that HKU4-CoV 3CLpro inhibitory activity is directly proportional to the number of aromatic rings in the inhibitor structure. This is to be expected, as the binding pocket is rich in aromatic amino acids (His41, His166, His175, Tyr54, and Phe143). Stacking of the ligands' aromatic rings against these aromatic residues in the binding pocket is likely to lead to a high binding affinity. However, the emergence of several dipole moment descriptors (Dipole, DipoleX, and DipoleY) combined with positive and negative regression coefficients in Equations 3 and 4 is suggestive of an obscure role of ligands' dipole moments in binding within the enzyme-binding pocket.

The emergence of LUMO in Equations 3 and 4 combined with negative slopes suggests that ligand/HKU4-CoV 3CLpro affinity favors electrophilic ligands, perhaps due to π-stacking against certain electron-rich aromatic centers in the binding pocket (eg, the aromatic rings of Tyr54 and Phe143).

The emergence of 3 pharmacophores, Hypo(K-T5-3), Hypo(N-T1-1), and Hypo(L-T5-2), in Equations 3 and 4 suggests possible multiple or complementary binding modes exhibited by ligands within the binding pocket. Receiver operating characteristic analysis of the 3 pharmacophores shows that Hypo(K-T5-3) and Hypo(N-T1-1) are significantly superior to Hypo(L-T5-2) (Table 1). Furthermore, MCC of the


FIGURE 3. Pharmacophoric features of the QSAR-guided pharmacophores and the corresponding merged model: green-vectored spheres: HBA; blue spheres: Hbic; purple-vectored spheres: HBD; and orange-vectored spheres: RingArom. (A) Hypo(N-T1-1), (B) Hypo(K-T5-3), (C) Merged-Hypo(K-T5-3/N-T1-1), (D) Refined Merged-Hypo(K-T5-3/N-T1-1), and (E) Hypo(L-T5-2) fitted against co-crystallized ligand within HKU4-CoV 3CLpro (compound 1, IC50 = 0.33 µM, PDB code 4YOI, 1.8 Å). (F) Ligand co-crystallized within HKU4-CoV 3CLpro and the chemical structure of the co-crystallized ligand. Arrows point to closely positioned common features in Hypo(N-T1-1) and Hypo(K-T5-3) allowing for merging. The 3D coordinates of these pharmacophores are shown in Table S6. HBA, hydrogen bond acceptor; HBD, hydrogen bond donor
TABLE 1 ROC and MCC performances of QSAR-guided pharmacophores

Pharmacophore Model            ROC-AUC   ACC    SPC    TPR    MCC
Hypo(L-T5-2)                   0.78      0.09   0.05   1.00   0.048
Hypo(K-T5-3)                   0.78      0.52   0.50   0.74   0.099
Hypo(N-T1-1)                   0.81      0.63   0.63   0.63   0.109
Hypo(K-T5-3/N-T1-1)            0.93      0.88   0.90   0.52   0.263
Refined Hypo(K-T5-3/N-T1-1)    0.94      0.89   0.91   0.48   0.262

Abbreviations: ACC, overall accuracy; AUC, area under the curve; MCC, Matthews correlation coefficient; ROC, receiver operating characteristic; SPC, overall specificity; TPR, overall true positive rate.


3 pharmacophores reflects the very weak classification abilities of Hypo(L-T5-2) (Table 1).

The very poor classification power of Hypo(L-T5-2) prompted us to exclude it from subsequent modeling efforts. However, Hypo(K-T5-3) and Hypo(N-T1-1) (Figure 3A,B) have 3 pharmacophoric features in common: hydrophobic (Hbic), ring aromatic (RingArom), and hydrogen bond acceptor (HBA) features. The close resemblance between these 2 pharmacophores combined with their equivalent contributions to bioactivity (as indicated by their slopes in QSAR Equations 3 and 4) suggests that they might represent a common binding mode assumed by ligands within the HKU4-CoV 3CLpro binding pocket. Therefore, these 2 pharmacophores were merged into a single binding model, Hypo(K-T5-3/N-T1-1) (Figure 3).

Interestingly, Hypo(K-T5-3/N-T1-1) showed noticeable improvement in distinguishing actives from decoys as indicated by the ROC analysis and MCC values (Table 1). Merging pharmacophores that share common features has been reported to improve the performance of pharmacophores in capturing active molecules.°° Additionally, Hypo(K-T5-3/N-T1-1) was further modified by adding exclusion spheres (Section S8 and Table S6) to enhance its ROC profile (Table 1). Exclusion volumes represent inaccessible regions within the binding site. Figure 3D shows the sterically refined version of Hypo(K-T5-3/N-T1-1) complemented with eight exclusion volumes.

Moreover, Hypo(K-T5-3/N-T1-1) maps the most potent ligand 1 (IC50 = 0.33 µM) in a way that closely resembles the interactions observed in the co-crystallized structure of the same compound with HKU4-CoV 3CLpro (4YOI) (Figure 3). The close proximity between the ligand's thiophenoyl moiety and the sulfide of Met25 (Figure 3F) suggests the presence of a mutual hydrophobic interaction, which correlates with mapping the same ring against a Hbic feature in Hypo(K-T5-3/N-T1-1) (Figure 3C). Similarly, mapping the carbonyl of the same thiophenoyl moiety against an HBA feature in Hypo(K-T5-3/N-T1-1) (Figure 3C) agrees with the hydrogen bonding interaction connecting this carbonyl to the thiol of Cys145 (Figure 3F). Likewise, the hydrogen bonding interaction connecting the amidic NH of the ligand to the peptidic carbonyl of His41 via a bridging water molecule agrees with mapping the same NH against a hydrogen bond donor (HBD) feature in Hypo(K-T5-3/N-T1-1) (Figure 3F). Mapping the ligand's benzotriazole ring against the RingArom feature in Hypo(K-T5-3/N-T1-1) (Figure 3C) correlates with stacking this ring system against the peptide amide connecting Cys145 and Leu144 in the binding pocket (Figure 3F). Finally, the hydrogen bonding interaction anchoring the ligand's tertiary amide carbonyl to the peptide NH of Glu169 corresponds to fitting the same carbonyl against an HBA feature in Hypo(K-T5-3/N-T1-1) (Figure 3C). These findings show that Hypo(K-T5-3/N-T1-1) represents a valid binding mode exhibited by the ligands within the binding pocket of HKU4-CoV 3CLpro. These interactions, highlighted by the pharmacophoric features within this model, are very likely to be critical for ligand-binding affinity.


3.2 | Structure-based approach: dbCICA modeling 


Structure-based pharmacophore models for HKU4-CoV 3CLpro were obtained by using dbCICA. In this approach, a subset of inhibitors (1-27, Table S1) was docked into the HKU4-CoV 3CLpro binding pocket using LibDock*” and CDOCKER® (Section 2.2). The highest-ranking conformers/poses based on each scoring function were aligned together to construct a corresponding dbCICA model. A genetic algorithm was then used to search for the best combination of ligand-receptor intermolecular contacts capable of explaining bioactivity variation across the training compounds. Table 2 shows the contact distance thresholds, number of positive and negative contacts, and statistical criteria of the best dbCICA models. Table 3 shows the critical binding site contact atoms proposed by optimal dbCICA models. The highest-ranking dbCICA models exhibited excellent statistical criteria and were anticipated to act as good templates for building corresponding pharmacophore models (Table 2). Figure 4 shows how dbCICA model SB-1 (Tables 2 and 3) was converted into its corresponding pharmacophore model Hypo(SB-1) as an example. The emergence of significant positive contact atoms at Pro45 and HOH225 (Figure 4A) combined with the consensus among potent docked ligands to position hydrophobic alkyl, cycloalkyl, or aromatic rings nearby (within 3.5 Å from Pro45 and HOH225, Figure 4C) prompted us to place a Hbic feature onto these ligand groups (Figure 4D). It is likely that hydrophobic fragments of the ligands interact with the side chain of Ala46.
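
The notion of a distance-thresholded intermolecular contact can be illustrated with a simplified sketch. It is not the dbCICA implementation: it only marks a binding-site atom as contacted when any ligand atom lies within the threshold (3.5 Å here) and sums those binary contacts, while the genetic-algorithm selection of contacts and the treatment of negative contacts are omitted. Atom labels and coordinates are hypothetical.

```python
import math


def has_contact(ligand_atoms, site_atom, threshold=3.5):
    """True if any ligand atom is within `threshold` angstroms of site_atom."""
    return any(math.dist(atom, site_atom) <= threshold for atom in ligand_atoms)


def summed_contacts(ligand_atoms, site_atoms, threshold=3.5):
    """Count binding-site atoms contacted by a docked ligand pose.

    ligand_atoms: list of (x, y, z) tuples for one docked pose.
    site_atoms: dict mapping an atom label to its (x, y, z) tuple.
    """
    return sum(
        1 for xyz in site_atoms.values()
        if has_contact(ligand_atoms, xyz, threshold)
    )


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, for illustration only.
    pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    site = {"siteA": (3.0, 0.0, 0.0), "siteB": (10.0, 0.0, 0.0)}
    print(summed_contacts(pose, site))
```

In a regression setting, such summed contact counts for a chosen set of critical atoms would serve as the independent variable correlated against bioactivity.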

Similarly, the emergence of the amidic NH of Gln192 as a significant positive contact in SB-1 combined with agreement among docked potent training compounds on placing their central benzene rings near this contact suggested placing an Hbic feature onto these benzene ligand fragments. Clearly, these rings are involved in hydrophobic interaction with the nearby thiol of Cys145 instead of π-stacking (as the nearest aromatic amino acid residue is His41 at about 4.5 Å away). This explains our decision to place a Hbic feature onto this region of the ligands (ie, rather than a RingArom feature).

Likewise, the appearance of His166 and HOH241 as positive contact points combined with agreement among potent hits to position their benzotriazoles close by suggested placing a hydrophobic aromatic (HbicArom) feature onto these benzotriazole moieties (Figure 4E). An HbicArom feature was added onto these rings instead of a vectored RingArom feature because the benzotriazoles, although docked near the imidazole of His166, did not exhibit typical π-stacking alignment with this residue. In contrast, the appearance of positive contacts at His41 and Asp190 combined with a consensus among docked potent inhibitors to project their thiophene rings close to the nearby imidazole of His41 suggests a mutual π-stacking interaction involving the electron-rich ligands' thiophenes and electron-deficient His41 imidazole. This
TABLE 2 The highest-ranking dbCICA models and their corresponding parameters and statistical criteriaᵃ

Model  Docking Engine  Scoring Function  Positive Contactsᵇ  Negative Contactsᶜ  Corr. (non-CV)ᵈ  Corr. (LOO)ᵉ  Corr. (L-20%-O)ᶠ  F Statistic
SB-1   CDOCKER         PMF               9                   10                  0.92             0.91          0.91              291.39
SB-2   CDOCKER         PMF               5                   5                   0.88             0.86          0.83              180.4
SB-3   LibDock         PLP2              5                   10                  0.90             0.88          0.87              221.48
SB-4   LibDock         PLP2              8                   5                   0.91             0.89          0.89              239.61
SB-5   LibDock         Lig2              5                   5                   0.86             0.84          0.84              147.68

Abbreviation: dbCICA, docking-based comparative intermolecular contacts analysis.
ᵃAll successful models listed herein were generated by docking the ligands into the binding site in the presence of crystallographically explicit water molecules and at ligand/binding site contact distance thresholds of 3.5 Å (Section S12).
ᵇOptimal number of combined (ie, summed) bioactivity-enhancing ligand/binding site contacts.
ᶜOptimal number of bioactivity-disfavoring ligand/binding site contacts.
ᵈNon-cross-validated correlation coefficient for 27 training compounds.
ᵉCross-validation correlation coefficients determined by the leave-one-out technique.
ᶠCross-validation correlation coefficients determined by the leave-20%-out technique repeated 5 times.


TABLE 3 Critical binding site contact atoms proposed by optimal dbCICA models

dbCICA   Favored Contact Atoms (Positive Contacts)ᵇ      Disfavored Contact Atoms
Modelᵃ   Amino acids and atom identitiesᶜ    Weightsᵈ    (Negative Contacts)ᵉ

SB-1     ASP190:CB                           2           CYS145:CB; CYS145:HB2; GLN167:O; GLN192:HA; GLN192:HG1;
         CYS145:HB1                          -           LEU144:C; LEU144:HD22; MET168:SD; HOH216:H1; HOH234:H1
         GLN192:HE21                         -
         GLU169:HN                           -
         HIS166:NE2                          -
         HIS41:CB                            -
         PRO45:CA                            -
         HOH225:H1                           -
         HOH241:O                            -

SB-2     PRO45:CA                            -           LEU144:C; LYS191:HN; MET168:SD; MET25:SD; CYS145:HG
         ASP190:O                            -
         GLU169:OE1                          -
         HIS166:NE2                          -
         PHE143:C                            -

SB-3     ASP190:C                            -           CYS44:HB1; CYS44:HB2; GLN195:HB1; HIS41:O; LYS191:C; LYS191:HN;
         HIS194:HN                           -           MET25:CG; MET25:N; PRO52:HD1; HOH116:H1
         MET168:HB2                          -
         PHE143:CA                           -
         SER24:HB2                           -

SB-4     ASP190:C                            -           GLN192:CD; GLU169:O; LEU49:CG; LEU49:HB2; MET168:HE2
         HIS41:HD2                           -
         LEU144:HA                           -
         MET168:HB2                          -
         MET168:SD                           -
         PHE143:C                            -
         THR193:N                            -
         HOH217:O                            -

SB-5     ALA46:CB                            -           ASP190:CB; CYS44:HB2; GLN167:O; HIS175:CD2; THR193:C
         ASP190:C                            -
         PHE143:O                            -
         PRO52:HG2                           -
         HOH401:H1                           -

-, weight not legible in the source.
Abbreviation: dbCICA, docking-based comparative intermolecular contacts analysis.
ᵃAs in Table 2.
ᵇBioactivity-proportional ligand/binding site contacts.
ᶜBinding site amino acids and their atomic contacts. Atom codes are as provided by the PDB file except for hydrogen atoms, which were coded by Discovery Studio.
ᵈDegree of significance (weight) of the corresponding contact atom, ie, the number of times it emerged in the final dbCICA model (see Section S12).
ᵉBioactivity-disfavoring ligand/binding site contacts.



FIGURE 4 Steps used in the manual generation of binding model Hypo(SB-1) as guided by dbCICA model SB-1 (Tables 2 and 3): (A) The
binding site moieties selected by dbCICA model SB-1 with significant contact atoms shown as spheres. (B) The docked pose of the potent
training compound 3 (IC50 = 1.2 μM) within the binding pocket. (C) The docked poses of the potent compounds 3, 4, 5, 6, and 8. (D) Manually
placed pharmacophoric features onto chemical moieties common among docked potent compounds 3, 4, 5, 6, and 8. (E) The docked pose of 3
and how it relates to the proposed pharmacophoric features. (F) Exclusion spheres fitted against binding site atoms showing negative
correlations with bioactivity (dbCICA model SB-1). Green vectored spheres: HBA, blue spheres: Hbic, violet spheres: HbicArom, and
orange-vectored spheres: RingArom. Exclusion spheres are shown in gray. dbCICA, docking-based comparative intermolecular contacts analysis;
HBA, hydrogen bond acceptor


observation supported placing a RingArom feature onto the thiophene
rings.

The emergence of a positive contact on the amidic NH of Glu169,
together with the agreement of docked compounds on placing their
central amide oxygen close to this NH, indicated the presence of a
hydrogen bonding interaction and suggested placing an HBA feature onto
the ligand amidic carbonyl groups (Figure 4E). This interaction is very
likely to involve hydrogen bonding with the peptide amidic NH of
Glu169.

Finally, all contact points of negative correlation with bioactivity
were assumed to represent areas of steric clashes with the bound
ligand. Therefore, such contacts were used to define exclusion
volumes within the vicinity of the binding pocket, as shown in
Figure 4F.
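
As a rough illustration of how exclusion volumes act during screening, the sketch below rejects a pose when any ligand atom center falls inside an exclusion sphere. The sphere radius, coordinates, and function name are hypothetical and are not taken from the pharmacophore software used in the study.

```python
import math

def violates_exclusion(ligand_atoms, exclusion_spheres):
    """Return True if any ligand atom centre lies inside any exclusion sphere.

    `exclusion_spheres` is a list of (centre, radius) pairs in angstroms.
    """
    return any(
        math.dist(atom, centre) < radius
        for atom in ligand_atoms
        for centre, radius in exclusion_spheres
    )

# Hypothetical sphere centred on a disfavored binding-site atom.
spheres = [((0.0, 0.0, 0.0), 1.2)]
pose_ok = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
pose_clash = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]

print(violates_exclusion(pose_ok, spheres))     # False
print(violates_exclusion(pose_clash, spheres))  # True
```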

The same strategy was used to translate all other optimal dbCICA
models in Tables 2 and 3 into their corresponding pharmacophore
models (Figure 5). The X, Y, and Z coordinates of the resulting
pharmacophores are shown in Table S7. Subsequent validation using
ROC analysis (Table 4) illustrated the excellent classification power
of these pharmacophores in distinguishing actives from decoys.
Matthews correlation coefficient values indicate that the
structure-based dbCICA models are superior in their classification
ability to the QSAR-guided pharmacophores.
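
For readers unfamiliar with the classification statistics named above, the following sketch computes accuracy, specificity, true positive rate, and the Matthews correlation coefficient from a confusion matrix using the standard textbook definitions. The counts are hypothetical, and the exact definitions used by the authors for Table 4 may differ.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix statistics for an actives-versus-decoys screen."""
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / total if total else 0.0,
        "TPR": tp / (tp + fn) if (tp + fn) else 0.0,
        "SPC": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is defined as 0 when any marginal total is empty.
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical screen: 10 actives (8 retrieved) and 90 decoys (5 retrieved).
metrics = classification_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the MCC stays informative when decoys greatly outnumber actives, which is one reason it is often reported alongside ROC-AUC.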


3.3 | In silico screening 


The QSAR-guided, sterically refined, merged pharmacophore
Hypo(K-T5-3/N-T1-1) and 5 dbCICA-based pharmacophores (Hypo(SB-1) to
Hypo(SB-5)) were used as 3D search queries to screen the NCI virtual
database for small molecule inhibitors of 3CLpro. Captured hits were
filtered by the Lipinski criteria and a SMARTS filter, as described in
Section 2.4.
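
A Lipinski-type filter can be sketched as below, assuming the physicochemical descriptors have already been computed by a cheminformatics package. The thresholds are the commonly cited rule-of-five values; the number of tolerated violations and the example molecules are assumptions, since the exact settings are those described in Section 2.4.

```python
def passes_lipinski(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Rule-of-five check on precomputed descriptors.

    `max_violations` is an assumed tolerance, not a value taken from the study.
    """
    violations = sum([
        mol_weight > 500,   # molecular weight in Da
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen bond donors
        h_acceptors > 10,   # hydrogen bond acceptors
    ])
    return violations <= max_violations

# Hypothetical hits: (molecular weight, logP, H-bond donors, H-bond acceptors).
hits = {
    "hit_A": (350.4, 2.1, 2, 5),
    "hit_B": (720.9, 6.3, 6, 12),
}
kept = [name for name, props in hits.items() if passes_lipinski(*props)]
print(kept)
```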

The QSAR-guided hits were fitted against component
pharmacophores (ie, Hypo(K-T5-3), Hypo(N-T1-1), and Hypo(L-T5-2)),
and their fit values were substituted in MLR-QSAR Equations 3 and
4 to predict their bioactivities. The top 39 compounds (of the
highest-ranking hits; prioritized using the voting system described in
Section 2.4) that were available in the NCI Open Chemicals Repository
were acquired for in vitro testing.

On the other hand, filtered dbCICA-derived hits were docked into the
HKU4-3CLpro protein using the same docking conditions of each
successful dbCICA model (SB-1, SB-2, SB-3, SB-4, and SB-5; Tables 2
and 3) to predict their corresponding inhibitory IC50 values (Section
2.4). The hits were ranked and prioritized using the voting system
described in Section 2.4, and the top 39 compounds were acquired
for in vitro testing. Thus, a total of 78 compounds from the NCI
Open Chemicals Repository were acquired for testing.

FIGURE 5 dbCICA pharmacophores derived from successful dbCICA models in Tables 2
and 3. (A) Hypo(SB-1) mapped against training compounds 5 and 6 (IC50 = 1.5 μM and 1.6 μM,
respectively; Table S1), (B) Hypo(SB-2) mapped against 5 and 6, (C) Hypo(SB-3) fitted
against 5, (D) Hypo(SB-4) mapped against 6, and (E) Hypo(SB-5) mapped against 5. Green
vectored spheres: HBA, purple-vectored spheres: HBD, blue spheres: Hbic, violet
spheres: HbicArom, and orange-vectored spheres: RingArom. Exclusion spheres are
shown in gray. dbCICA, docking-based comparative intermolecular contacts analysis;
HBA, hydrogen bond acceptor; HBD, hydrogen bond donor


TABLE 4 ROC and MCC performances of the dbCICA-based
pharmacophores

Pharmacophore Model  ROC-AUC  ACC    SPC    TPR    MCC
Hypo(SB-1)           0.946    0.495  0.726  0.815  0.241
Hypo(SB-2)           0.976    0.632  0.944  0.666  0.713
Hypo(SB-3)           0.932    0.573  0.854  0.666  0.283
Hypo(SB-4)           0.971    0.615  0.918  0.666  0.384
Hypo(SB-5)           0.897    0.425  0.611  0.963  0.254

Abbreviations: ACC, overall accuracy; AUC, area under the curve; MCC,
Matthews correlation coefficient; ROC, receiver operating characteristic;
SPC, overall specificity; TPR, overall true positive rate.


3.4 | In vitro validation


A total of 78 NCI compounds (Figure S1), comprising 39 QSAR-guided
hits and 39 dbCICA-derived hits, were acquired and screened in vitro
to determine their inhibitory activity against HKU4-CoV 3CLpro and
MERS-CoV 3CLpro at a hit concentration of 40 μM. The 3CLpro enzyme
assay used in this study was carefully designed to avoid misleading
false positives and to prevent wasted follow-up on promiscuous
compounds (by adding albumin, DTT, and Triton X-100 to the reaction
mixture). Tables S8 and S9 show the % inhibition against 3CLpro of
the hits captured by the QSAR-guided and the dbCICA-derived
pharmacophores, respectively.

Only a single compound (NCI code 134140) of the 39 tested hits
captured by the QSAR-guided pharmacophores showed inhibitory
activity ≥50% against both HKU4-CoV 3CLpro and MERS-CoV 3CLpro.
However, this compound has a molecular fragment known to cause
pan-assay interference (PAINS-like; Baell) and was therefore not
considered as a hit in further characterizations. Three compounds of the
same ligand-based hits (NCI codes: 12156, 22906, and 28562; Table
S8) showed unexpectedly high negative values of their activity against
MERS-CoV 3CLpro (-633.2%, -203.4%, and -662.6% at 40 μM; Table
S8). Several controls were performed in which either the substrate or
the enzyme or both were omitted from the assay (data not shown).
None of these hits showed evidence of fluorescence interference. It
is possible that these compounds act as activators of the
enzyme. However, further evidence is still needed to support this
hypothesis. It was previously observed that designed reversible
peptidomimetic inhibitors acted as activators at a low compound
concentration as a result of induced dimerization. Therefore, these 3 hits
will not be discussed in the current publication.
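
The meaning of a negative % inhibition can be made concrete with a short sketch. It assumes the conventional definition of % inhibition relative to an uninhibited control rate; the source does not state the formula in this section, and the rate values below are hypothetical.

```python
def percent_inhibition(rate_with_compound, rate_control):
    """Percent inhibition relative to an uninhibited control reaction rate.

    Assumed conventional definition: 100 * (1 - v_compound / v_control).
    Negative results mean the rate exceeded that of the control.
    """
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - rate_with_compound / rate_control)

# Hypothetical rates in arbitrary fluorescence units per minute.
print(round(percent_inhibition(48.1, 100.0), 1))   # an inhibitor
print(round(percent_inhibition(733.2, 100.0), 1))  # an apparent activator
```

Under this assumed definition, a value of -633.2% would correspond to a measured rate roughly 7.3 times that of the control, which is why such compounds read as apparent activators rather than inhibitors.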

Only a single compound (222; NCI code 120178) of the dbCICA-derived
hits showed inhibitory activity ≥50% (at 40 μM) against
MERS-CoV 3CLpro (51.9%; Table S9 and Figure 6). This activity is
comparable to that of the positive control against the MERS-CoV
enzyme (compound 1; 63.8%, Figure 6). However, compound 222
failed to show significant inhibitory activity against HKU4-CoV 3CLpro.
The purity of 222 was confirmed using nuclear magnetic resonance
and mass spectrometry (Figure S2). Another compound, 223, was
found to exhibit somewhat lower activity against the MERS-CoV enzyme
(28% inhibition at 40 μM). The purity of 223 was confirmed using
nuclear magnetic resonance and mass spectrometry (Figure S3).
Compounds 222 and 223 (NCI code 128947) share a common
phenylsulfonamide fragment, which is amenable to chemical
modifications. Both compounds were captured by the Hypo(SB-3) and
Hypo(SB-5) pharmacophores (Table 4). Figure 7 shows how hit 222
maps onto the dbCICA pharmacophore models.

Further controls were conducted (as described above) to
rule out fluorescence interference. None of these hits showed
significant fluorescence in the assay buffer (no enzyme and no
substrate), in the presence of the enzyme (no substrate), or in the
presence of the substrate (no enzyme) (data not shown). However,
at concentrations >100 μM, 222 showed approximately 10%
attenuation of the cleaved substrate fluorescence (data not shown).
Both 222 and 223 showed moderate apparent IC50 values against
the MERS-CoV 3CLpro of 98.7 μM and 131.1 μM, respectively
(Figure 6). The shape of the activity curve of compound 222, which
shows a linear inhibition of fluorescence up to a maximum inhibition,
indicates the influence of the inner filter effect (Figure S4). The inner
filter effect is one of the major challenges usually encountered in
FRET-based enzyme assays.

Code                                      1 (positive control)  222         223
% inhibition at 40 μM (MERS-CoV 3CLpro)   63.8 ± 1              51.9 ± 7    28 ± 7
Apparent IC50 (μM) (MERS-CoV 3CLpro)      ND                    98.7 ± 6.0  131.1 ± 4.8
Hill coefficient, n                                             1.7 ± 0.1   2.8 ± 0.2

FIGURE 6 The chemical structures, inhibitory activities, and apparent IC50 values of the positive control 1, and the 2 tested hits captured by the
dbCICA-derived pharmacophores (222 and 223). dbCICA, docking-based comparative intermolecular contacts analysis

FIGURE 7 dbCICA-based pharmacophores derived from successful dbCICA models (Tables 2 and 3) mapped against hit compound 222. (A) Hypo(SB-1),
(B) Hypo(SB-2), (C) Hypo(SB-3), (D) Hypo(SB-4), and (E) Hypo(SB-5). Green-vectored spheres: HBA, purple-vectored spheres: HBD, blue
spheres: Hbic, violet spheres: HbicArom, and orange-vectored spheres: RingArom. Exclusion spheres are shown in gray.
dbCICA, docking-based comparative intermolecular contacts analysis; HBA, hydrogen bond acceptor; HBD, hydrogen bond donor

The low hit rate observed in this study can be explained by the
limited availability of many of the top-ranked hits in the NCI Open
Chemicals Repository and hence the limited number of tested hits
(only 78 hits).

There was also a limitation in the availability of published potent
MERS-CoV 3CLpro inhibitors to be used as a training set in modeling
enzyme inhibition. The predictive ability of computational models
depends strongly on the compounds used in modeling. The training
compounds used in the current study are all peptide-like compounds,
and only 14% of them exhibited IC50 values <10 μM. The effect of the
starting training set was prominent in the ligand-based modeling
(QSAR-guided models), which explains the poor quality of these
models as indicated by their low MCC values.
Clearly, the quality of the training set is a pivotal factor in
determining the predictive validity of the obtained pharmacophores.
It is also worth noting that the active site-directed design of
nonpeptidomimetic small molecule inhibitors of proteases is often
challenging because of the unique chemistry of the peptide-bond
cleavage transition state and because some proteases cleave their
substrates through an induced fit mechanism.


4 | CONCLUSIONS 


Recently, special attention has been paid to bat coronaviruses. Two
deadly emerging coronaviruses that have caused unexpected human
disease outbreaks, SARS-CoV and MERS-CoV, are suggested to have
originated from bats. MERS-CoV is now considered a threat to global
public health. While its human-to-human transmission is so far limited,
serious concerns over its pandemic potential have been raised.
Therefore, there is an urgent need for the development of effective
and safe anti-MERS-CoV treatment.

In this study, we have explored the pharmacophoric space of the
recently identified peptidomimetic HKU4-3CLpro inhibitors by 2
independent approaches: QSAR-guided pharmacophore modeling
and dbCICA-based pharmacophore construction. Both approaches
have successfully resulted in the identification of novel potent
inhibitors of a wide variety of targets. QSAR-guided pharmacophore
modeling is a ligand-based method, in which pharmacophores are
derived by extensive exploration of the 3D space of a carefully
selected, variable small subset of the inhibitors. These pharmacophores
are then allowed to compete within the context of classical QSAR
using GFA and MLR to identify the combinations that give the best
estimates of the bioactivities. dbCICA modeling, on the other hand,
is a structure-based pharmacophore construction method, which relies
on accurate selection of the most successful combinations of docking
and scoring conditions. The success criterion is the ability of the
docking run to align potent ligands in a way that would allow them
to form contacts unattainable by low-potency ligands. dbCICA can be
considered a 3D QSAR that correlates ligands' affinities to their
contacts with certain binding site spots by using GFA and MLR.
Successful dbCICA models can then be translated into binding models
(pharmacophores) to be used as in silico screening tools of virtual
databases.

We have applied these robust computational methods to model
HKU4-CoV 3CLpro inhibitors as a tool to identify inhibitors of
MERS-CoV 3CLpro. These models assisted in the identification of 2 hit
compounds with moderate apparent activity against MERS-CoV
3CLpro. The identified inhibitors share a novel nonpeptidomimetic
scaffold that is amenable to medicinal chemistry optimization efforts.
Despite the modest inhibitory activity of this scaffold, it represents a
potential starting point in the discovery of novel MERS-CoV antivirals.
There are several successful examples in the history of drug discovery
in which the starting hits showed low-to-moderate enzyme inhibition.
For example, the millimolar inhibitor Neu5Ac was the starting point in
the development of zanamivir, the first influenza neuraminidase
inhibitor introduced to the market.

Most importantly, the established ligand-based and structure-based
pharmacophore models serve as tools for advancing our
understanding of small molecule recognition by the coronavirus 3CLpro
enzymes. The pharmacophores obtained by modeling the HKU4-CoV
3CLpro inhibitors revealed structural features needed for the design of
potent 3CLpro enzyme inhibitors, while the dbCICA models
(structure-based models) highlighted potential hot-spot regions in the
3CLpro pocket that could be targeted using small nonpeptidomimetic
molecules. Such knowledge is valuable for the successful development
of 3CLpro inhibitors as anti-MERS drugs.